
The Biotechnology Systems Branch of the Scientific and Technical Information 
Center (STIC) detected errors when processing the following computer readable 
form: 

Application Serial Number: 09/671,635A

Source: AYiT

Date Processed by STIC: 06/20/2001

RECEIVED
JUL 11 2001
TECH CENTER 1600/2900

THE ATTACHED PRINTOUT EXPLAINS DETECTED ERRORS. 

PLEASE FORWARD THIS INFORMATION TO THE APPLICANT BY EITHER: 

1) INCLUDING A COPY OF THIS PRINTOUT IN YOUR NEXT COMMUNICATION TO THE 
APPLICANT, WITH A NOTICE TO COMPLY or, 

2) TELEPHONING APPLICANT AND FAXING A COPY OF THIS PRINTOUT, WITH A 
NOTICE TO COMPLY 

FOR CRF SUBMISSION QUESTIONS, PLEASE CONTACT MARK SPENCER, 703-308-4212.

FOR SEQUENCE RULES INTERPRETATION, PLEASE CONTACT ROBERT WAX, 703-308-4216.
PATENTIN 2.1 e-mail help: patin21help@uspto.gov or phone 703-306-4119 (R. Wax)
PATENTIN 3.0 e-mail help: patin3help@uspto.gov or phone 703-306-4119 (R. Wax)

TO REDUCE ERRORED SEQUENCE LISTINGS, PLEASE USE THE CHECKER
VERSION 3.0 PROGRAM, ACCESSIBLE THROUGH THE U.S. PATENT AND
TRADEMARK OFFICE WEBSITE. SEE BELOW:



Checker Version 3.0

The Checker Version 3.0 application is a state-of-the-art Windows-based software program
employing a logical and intuitive user interface to check whether a sequence listing is in
compliance with format and content rules. Checker Version 3.0 works for sequence listings
generated for the original version of 37 CFR §§1.821 - 1.825 effective October 1, 1990 (old
rules) and the revised version (new rules) effective July 1, 1998, as well as World Intellectual
Property Organization (WIPO) Standard ST.25.

Checker Version 3.0 replaces the previous DOS-based version of Checker, and is Y2K-compliant.
Checker allows public users to check sequence listings in Computer Readable Form
(CRF) before submitting them to the United States Patent and Trademark Office (USPTO).
Use of Checker prior to filing the sequence listing is expected to result in fewer errored sequence
listings, thus saving time and money.



Checker Version 3.0 can be downloaded from the USPTO website at the following address:

http://www.uspto.gov/web/offices/pac/checker 



Raw Sequence Listing Error Summary

ERROR DETECTED          SUGGESTED CORRECTION          SERIAL NUMBER:

ATTN: NEW RULES CASES: PLEASE DISREGARD ENGLISH "ALPHA" HEADERS, WHICH WERE INSERTED BY PTO SOFTWARE

1 Wrapped Nucleics, Wrapped Aminos

The number/text at the end of each line "wrapped" down to the next line. This may occur if your file
was retrieved in a word processor after creating it. Please adjust your right margin to .3; this will
prevent "wrapping."

2 Invalid Line Length

The rules require that a line not exceed 72 characters in length. This includes white spaces.
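
The 72-character limit can be checked before a file is submitted. The following is a minimal Python sketch and is not part of the Checker program; the function name `find_long_lines` and the sample text are hypothetical and chosen only for illustration.

```python
MAX_LINE_LENGTH = 72  # limit quoted in the notice; white space counts


def find_long_lines(text, limit=MAX_LINE_LENGTH):
    """Return (line_number, length) pairs for lines longer than `limit`."""
    problems = []
    for number, line in enumerate(text.splitlines(), start=1):
        if len(line) > limit:
            problems.append((number, len(line)))
    return problems


# Hypothetical sample: line 2 has 73 characters, line 3 has exactly 72 spaces.
sample = "<210> 1\n" + "a" * 73 + "\n" + " " * 72
print(find_long_lines(sample))  # [(2, 73)]
```

A line that a word processor has wrapped will usually not exceed the limit itself, so this check only covers the line length rule, not the "wrapped" error.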



3 Misaligned Amino Numbering

The numbering under each 5th amino acid is misaligned. Do not use tab codes between numbers;
use space characters instead.

4 Non-ASCII

The submitted file was not saved in ASCII (DOS) text, as required by the Sequence Rules. Please
ensure your subsequent submission is saved in ASCII text.
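
Both conditions (tab codes and non-ASCII content) can be detected by scanning the raw bytes of the file. The sketch below is one possible way to do so in Python; the function name `find_bad_characters` and the sample bytes are hypothetical, and the check is only an approximation of what the PTO software reports.

```python
def find_bad_characters(data: bytes):
    """Return (line_number, column, description) for tabs and non-ASCII bytes."""
    problems = []
    for line_number, line in enumerate(data.splitlines(), start=1):
        for column, byte in enumerate(line, start=1):
            if byte == 0x09:
                problems.append((line_number, column, "tab"))
            elif byte > 0x7F:
                problems.append((line_number, column, "non-ASCII byte"))
    return problems


# Hypothetical sample: a tab in the numbering line and one non-ASCII byte.
sample = b"Ala Gly\n1\t5\nAla \xe9\n"
for problem in find_bad_characters(sample):
    print(problem)
```

Reading the file in binary mode (for example `open(path, "rb").read()`) avoids any decoding step that could hide or alter the offending bytes.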



5 Variable Length

Sequence(s) contain n's or Xaa's representing more than one residue. Per Sequence Rules,
each n or Xaa can only represent a single residue. Please present the maximum number of each
residue having variable length and indicate in the <220>-<223> section that some may be missing.



6 PatentIn 2.0 "bug"

A "bug" in PatentIn version 2.0 has caused the <220>-<223> section to be missing from amino acid
sequence(s). Normally, PatentIn would automatically generate this section from the
previously coded nucleic acid sequence. Please manually copy the relevant <220>-<223> section to
the subsequent amino acid sequence. This applies to the mandatory <220>-<223> sections for
Artificial or Unknown sequences.

7 Skipped Sequences (OLD RULES)

Sequence(s) missing. If intentional, please insert the following lines for each skipped sequence:

(2) INFORMATION FOR SEQ ID NO:X: (insert SEQ ID NO where "X" is shown)
(i) SEQUENCE CHARACTERISTICS: (Do not insert any subheadings under this heading)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:X: (insert SEQ ID NO where "X" is shown)
This sequence is intentionally skipped

Please also adjust the "(ii) NUMBER OF SEQUENCES:" response to include the skipped sequences.



8 Skipped Sequences (NEW RULES)

Sequence(s) missing. If intentional, please insert the following lines for each skipped sequence.

<210> sequence id number
<400> sequence id number
000
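
The three placeholder lines above are the same for every skipped sequence apart from the sequence id number, so they can be generated mechanically. The following Python sketch assumes the caller already knows which SEQ ID numbers were skipped on purpose; the function names are hypothetical.

```python
def skipped_sequence_block(seq_id: int) -> str:
    """Return the three placeholder lines for one intentionally skipped sequence."""
    return f"<210> {seq_id}\n<400> {seq_id}\n000\n"


def placeholders(skipped_ids):
    """Return placeholder blocks for all skipped ids, in ascending order."""
    return "\n".join(skipped_sequence_block(i) for i in sorted(skipped_ids))


# Hypothetical example: SEQ ID NOs 3 and 7 were intentionally skipped.
print(placeholders([7, 3]))
```

Each block still has to be placed at the position in the listing where the skipped sequence would have appeared.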



9 Use of n's or Xaa's (NEW RULES)

Use of n's and/or Xaa's has been detected in the Sequence Listing.
Per 1.823 of Sequence Rules, use of <220>-<223> is MANDATORY if n's or Xaa's are present.
In <220> to <223> section, please explain location of n or Xaa, and which residue n or Xaa represents.

10 Invalid <213> Response

Per 1.823 of Sequence Rules, the only valid <213> responses are: Unknown, Artificial Sequence, or
scientific name (Genus/species). <220>-<223> section is required when <213> response is Unknown or
is Artificial Sequence.
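
The requirement that a <220>-<223> section accompany certain records can be screened for with a short script. The Python sketch below is an illustration under stated assumptions, not the Checker program's logic: it assumes a listing in which each line begins with its bare numeric identifier (without the English "alpha" headers inserted by PTO software), it detects Xaa but not nucleotide n's, and it does not attempt to validate scientific names. The function names and the sample listing, including the organism name, are hypothetical.

```python
import re

# <213> responses that make a <220>-<223> section mandatory, per the notice.
NEEDS_FEATURE = {"unknown", "artificial sequence"}


def split_records(text):
    """Group lines into records, starting a new record at each <210> line."""
    records, current = [], None
    for line in text.splitlines():
        if line.startswith("<210>"):
            current = []
            records.append(current)
        if current is not None:
            current.append(line)
    return records


def missing_feature_sections(text):
    """Return SEQ ID values whose record needs <220>-<223> but has no <220>."""
    flagged = []
    for record in split_records(text):
        seq_id = record[0][len("<210>"):].strip()
        organism, has_feature, uses_xaa, in_sequence = "", False, False, False
        for line in record[1:]:
            if line.startswith("<213>"):
                organism = line[len("<213>"):].strip().lower()
            elif line.startswith("<220>"):
                has_feature = True
            elif line.startswith("<400>"):
                in_sequence = True
            elif in_sequence and re.search(r"\bXaa\b", line):
                uses_xaa = True
        if (organism in NEEDS_FEATURE or uses_xaa) and not has_feature:
            flagged.append(seq_id)
    return flagged


# Hypothetical three-record listing used only to exercise the check.
listing = """<210> 1
<211> 4
<212> PRT
<213> Artificial Sequence
<400> 1
Ala Gly Cys Asn
1
<210> 2
<211> 4
<212> PRT
<213> Artificial Sequence
<220>
<223> Synthetic consensus peptide
<400> 2
Ala Gly Ile Met
1
<210> 3
<211> 4
<212> PRT
<213> Arabidopsis thaliana
<400> 3
Ala Xaa Leu Ile
1
"""
print(missing_feature_sections(listing))  # ['1', '3']
```

A response such as "Consensus Sequence" is neither of the two fixed responses nor a scientific name, so it would need to be corrected by hand before a check like this one is meaningful.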



11 Use of <220>

Sequence(s) missing the <220> "Feature" and associated numeric identifiers and responses.
Use of <220> to <223> is MANDATORY if <213> "Organism" response is "Artificial Sequence" or
"Unknown." Please explain source of genetic material in <220> to <223> section.
(See "Federal Register," 06/01/1998)

12 PatentIn 2.0 "bug"

Please do not use "Copy to Disk" function of PatentIn version 2.0. This causes a corrupted file,
resulting in missing mandatory numeric identifiers and responses (as indicated on raw sequence
listing). Instead, please use "File Manager" or any other manual means to copy file to floppy disk.



AMC - Biotechnology Systems Branch - 06/04/2001






1645 



RAW SEQUENCE LISTING                     DATE: 06/20/2001
PATENT APPLICATION: US/09/671,635A       TIME: 11:08:56

Input Set : N:\COPIES\ES.txt
Output Set: N:\CRF3\06202001\I671635A.raw

Does Not Comply
Corrected Diskette Needed



<110> APPLICANT: ALEXANDROV, Nickolai et al.
<120> TITLE OF INVENTION: SEQUENCE-DETERMINED DNA FRAGMENTS AND CORRESPONDING
      POLYPEPTIDES ENCODED THEREBY
<130> FILE REFERENCE: 2750-1026P
<140> CURRENT APPLICATION NUMBER: US 09/671,635A
<141> CURRENT FILING DATE: 2000-09-28
<160> NUMBER OF SEQ ID NOS: 802
<170> SOFTWARE: PatentIn version 3.0

<210> SEQ ID NO: 1
<211> LENGTH: 4
<212> TYPE: PRT
<213> ORGANISM: Consensus Sequence
<400> SEQUENCE: 1
Ala Gly Cys Asn
1

<210> SEQ ID NO: 2
<211> LENGTH: 4
<212> TYPE: PRT
<213> ORGANISM: Consensus Sequence
<400> SEQUENCE: 2
Ala Gly Ile Met
1

<210> SEQ ID NO: 3
<211> LENGTH: 4
<212> TYPE: PRT
<213> ORGANISM: Consensus Sequence
<400> SEQUENCE: 3
Ala Gly Leu Ile
1

<210> SEQ ID NO: 4
<211> LENGTH: 4
<212> TYPE: PRT
<213> ORGANISM: Consensus Sequence
<400> SEQUENCE: 4
Ala Gly Leu Met
1

<210> SEQ ID NO: 5
<211> LENGTH: 5
<212> TYPE: PRT
<213> ORGANISM: Consensus Sequence
<400> SEQUENCE: 5
Ala Gly Ser Cys Ile
1               5

<210> SEQ ID NO: 6
<211> LENGTH: 5
<212> TYPE: PRT
<213> ORGANISM: Consensus Sequence



<400> SEQUENCE: 6
Ala Gly Ser Asp Met
1               5

<210> SEQ ID NO: 7
<211> LENGTH: 4
<212> TYPE: PRT
<213> ORGANISM: Consensus Sequence
<400> SEQUENCE: 7
Ala Ile Val Pro
1

<210> SEQ ID NO: 8
<211> LENGTH: 4
<212> TYPE: PRT
<213> ORGANISM: Consensus Sequence
<400> SEQUENCE: 8
Ala Leu Ile Val
1

<210> SEQ ID NO: 9
<211> LENGTH: 4
<212> TYPE: PRT
<213> ORGANISM: Consensus Sequence
<400> SEQUENCE: 9
Ala Pro Asn Thr
1

<210> SEQ ID NO: 10
<211> LENGTH: 5
<212> TYPE: PRT
<213> ORGANISM: Consensus Sequence
<400> SEQUENCE: 10
Ala Ser Leu Val
1

<210> SEQ ID NO: 11
<211> LENGTH: 5
<212> TYPE: PRT
<213> ORGANISM: Consensus Sequence
<400> SEQUENCE: 11
Ala Ser Thr Asp Val
1               5

<210> SEQ ID NO: 12
<211> LENGTH: 5
<212> TYPE: PRT
<213> ORGANISM: Consensus Sequence
<400> SEQUENCE: 12
Ala Ser Thr Pro Val
1               5

<210> SEQ ID NO: 13
<211> LENGTH: 5
<212> TYPE: PRT
<213> ORGANISM: Consensus Sequence



<400> SEQUENCE: 13
Ala Val Asn His Lys
1               5

<210> SEQ ID NO: 14
<211> LENGTH: 5
<212> TYPE: PRT
<213> ORGANISM: Consensus Sequence
<400> SEQUENCE: 14
Cys Ser Ala Gly Asn
1               5

<210> SEQ ID NO: 15
<211> LENGTH: 4
<212> TYPE: PRT
<213> ORGANISM: Consensus Sequence
<400> SEQUENCE: 15
Cys Ser Ala Met
1

<210> SEQ ID NO: 16
<211> LENGTH: 4
<212> TYPE: PRT
<213> ORGANISM: Consensus Sequence
<400> SEQUENCE: 16
Cys Ser Ala Val
1

<210> SEQ ID NO: 17
<211> LENGTH: 4
<212> TYPE: PRT
<213> ORGANISM: Consensus Sequence
<400> SEQUENCE: 17
Cys Ser Thr Ala
1

<210> SEQ ID NO: 18
<211> LENGTH: 6
<212> TYPE: PRT
<213> ORGANISM: Consensus Sequence
<400> SEQUENCE: 18
Cys Ser Thr Ala
1

<210> SEQ ID NO: 19
<211> LENGTH: 5
<212> TYPE: PRT
<213> ORGANISM: Consensus Sequence
<400> SEQUENCE: 19
Asp Ala Gly His Glu
1               5

<210> SEQ ID NO: 20
<211> LENGTH: 4
<212> TYPE: PRT
<213> ORGANISM: Consensus Sequence
<400> SEQUENCE: 20
Asp Glu Ala Gly
1

<210> SEQ ID NO: 21
<211> LENGTH: 4
<212> TYPE: PRT
<213> ORGANISM: Consensus Sequence
<400> SEQUENCE: 21
Asp Glu Ala Pro
1

<210> SEQ ID NO: 22
<211> LENGTH: 5
<212> TYPE: PRT
<213> ORGANISM: Consensus Sequence
<400> SEQUENCE: 22
Asp Glu Phe Tyr Trp
1               5

<210> SEQ ID NO: 23
<211> LENGTH: 8
<212> TYPE: PRT
<213> ORGANISM: Consensus Sequence
<400> SEQUENCE: 23
Asp Glu Gly Ser Thr His Lys Arg
1               5

<210> SEQ ID NO: 24
<211> LENGTH: 8
<212> TYPE: PRT
<213> ORGANISM: Consensus Sequence
<400> SEQUENCE: 24
Asp Glu Lys Arg His Ser Thr Ala
1               5

<210> SEQ ID NO: 25
<211> LENGTH: 4
<212> TYPE: PRT
<213> ORGANISM: Consensus Sequence
<400> SEQUENCE: 25
Asp Glu Asn Gly
1

<210> SEQ ID NO: 26
<211> LENGTH: 6
<212> TYPE: PRT
<213> ORGANISM: Consensus Sequence
<400> SEQUENCE: 26
Asp Glu Asn Lys Ala Cys
1               5

<210> SEQ ID NO: 27
<211> LENGTH: 6
<212> TYPE: PRT
<213> ORGANISM: Consensus Sequence



<400> SEQUENCE: 27
Asp Glu Asn Lys His Ser
1               5

<210> SEQ ID NO: 28
<211> LENGTH: 5
<212> TYPE: PRT
<213> ORGANISM: Consensus Sequence
<400> SEQUENCE: 28
Asp Glu Asn Lys Ser
1               5

<210> SEQ ID NO: 29
<211> LENGTH: 4
<212> TYPE: PRT
<213> ORGANISM: Consensus Sequence
<400> SEQUENCE: 29
Asp Glu Asn Leu
1

<210> SEQ ID NO: 30
<211> LENGTH: 6
<212> TYPE: PRT
<213> ORGANISM: Consensus Sequence
<400> SEQUENCE: 30
Asp Glu Asn Pro His Lys
1               5

<210> SEQ ID NO: 31
<211> LENGTH: 4
<212> TYPE: PRT
<213> ORGANISM: Consensus Sequence
<400> SEQUENCE: 31
Asp Glu Asn Gln
1

<210> SEQ ID NO: 32
<211> LENGTH: 7
<212> TYPE: PRT
<213> ORGANISM: Consensus Sequence
<400> SEQUENCE: 32
Asp Glu Asn Gln Ala Arg Lys
1               5

<210> SEQ ID NO: 33
<211> LENGTH: 6
<212> TYPE: PRT
<213> ORGANISM: Consensus Sequence
<400> SEQUENCE: 33
Asp Glu Asn Gln Gly Ala
1               5

<210> SEQ ID NO: 34
<211> LENGTH: 8
<212> TYPE: PRT
<213> ORGANISM: Consensus Sequence


The types of errors shown exist throughout the Sequence Listing. Please check 
subsequent sequences for similar errors. 
VERIFICATION SUMMARY                     DATE: 06/20/2001
PATENT APPLICATION: US/09/671,635A       TIME: 11:08:57

Input Set : N:\COPIES\ES.txt
Output Set: N:\CRF3\06202001\I671635A.raw